
    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased considerably over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently provides desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
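
The idea of kernels that jointly define a continuous approximation can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a common gated mixture-of-experts form in which each kernel has a Gaussian gate and a linear expert, restricted to a 1-D signal for brevity. The function name `smoe_reconstruct` and all parameter names are hypothetical.

```python
import numpy as np

def smoe_reconstruct(x, centers, bandwidths, slopes, offsets):
    """Evaluate a 1-D gated mixture-of-experts model at positions x.

    Assumed form (illustrative only): kernel k has a Gaussian gate centred
    at centers[k] with width bandwidths[k], and a linear expert
    m_k(x) = slopes[k] * (x - centers[k]) + offsets[k].
    """
    x = np.asarray(x, dtype=float)[:, None]               # shape (N, 1)
    centers = np.asarray(centers, dtype=float)[None, :]   # shape (1, K)
    bandwidths = np.asarray(bandwidths, dtype=float)[None, :]
    slopes = np.asarray(slopes, dtype=float)[None, :]
    offsets = np.asarray(offsets, dtype=float)[None, :]

    # Log of the Gaussian density of every kernel at every position
    # (constant terms dropped because they cancel in the normalisation).
    log_gate = -0.5 * ((x - centers) / bandwidths) ** 2 - np.log(bandwidths)
    log_gate -= log_gate.max(axis=1, keepdims=True)       # numerical stability
    gate = np.exp(log_gate)
    gate /= gate.sum(axis=1, keepdims=True)               # weights sum to 1 per position

    experts = slopes * (x - centers) + offsets            # shape (N, K)
    return (gate * experts).sum(axis=1)                   # gated sum of experts

# Two flat kernels: the output blends smoothly from 1 (near x=0) to 5 (near x=10).
y = smoe_reconstruct([0.0, 5.0, 10.0],
                     centers=[0.0, 10.0], bandwidths=[1.0, 1.0],
                     slopes=[0.0, 0.0], offsets=[1.0, 5.0])
print(y)
```

Because the model can be evaluated at any position `x` from the kernel parameters alone, each sample is computed independently of the others, which is one way to read the pixel-parallel reconstruction and intrinsic interpolation mentioned in the abstract.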

    A multi-configuration part-based person detector

    Proceedings of the Special Session on Multimodal Security and Surveillance Analytics 2014, held during the International Conference on Signal Processing and Multimedia Applications (SIGMAP 2014) in Vienna. People detection is a task that has generated great interest in the computer vision community and especially in the surveillance community. One of the main problems of this task in crowded scenarios is the high number of occlusions caused by persons appearing in groups. In this paper, we address this problem by combining individual body part detectors in a statistically driven way so that persons can be detected even when the detection of some body parts fails, i.e., we propose a generic scheme to deal with partial occlusions. We demonstrate the validity of our approach and compare it with other state-of-the-art approaches on several public datasets. In our experiments we consider sequences with different complexities in terms of occupation, and therefore with different numbers of people present in the scene, in order to highlight the benefits and difficulties of the approaches considered for evaluation. The results show that our approach improves on the results provided by state-of-the-art approaches, especially in the case of crowded scenes. This work has been done while visiting the Communication Systems Group at the Technische Universität Berlin (Germany) under the supervision of Prof. Dr.-Ing. Thomas Sikora. This work has been partially supported by the Universidad Autónoma de Madrid (“Programa propio de ayudas para estancias breves en España y extranjero para Personal Docente e Investigador en Formación de la UAM”), by the Spanish Government (TEC2011-25995 EventVideo) and by the European Community’s FP7 under grant agreement number 261776 (MOSAIC).

    Fast structural changes (200–900 ns) may prepare the photosynthetic manganese complex for oxidation by the adjacent tyrosine radical

    The Mn complex of photosystem II (PSII) cycles through 4 semi-stable states (S0 to S3). Laser-flash excitation of PSII in the S2 or S3 state induces processes with time constants around 350 ns, which have been assigned previously to energetic relaxation of the oxidized tyrosine (YZox). Herein we report monitoring of these processes in the time domain of hundreds of nanoseconds by photoacoustic (or ‘optoacoustic’) experiments involving pressure-wave detection after excitation of PSII membrane particles by ns laser flashes. We find that specifically for excitation of PSII in the S2 state, nuclear rearrangements are induced which amount to a contraction of PSII by at least 30 Å³ (time constant of 350 ns at 25 °C; activation energy of 285 ± 50 meV). In the S3 state, the 350-ns contraction is about 5 times smaller, whereas in S0 and S1 no volume changes are detectable in this time domain. It is proposed that the classical S2 → S3 transition of the Mn complex is a multi-step process. The first step after YZox formation involves a fast nuclear rearrangement of the Mn complex and its protein–water environment (~350 ns), which may serve a dual role: (1) The Mn-complex entity is prepared for the subsequent proton removal and electron transfer by formation of an intermediate state of specific (but still unknown) atomic structure. (2) Formation of the structural intermediate is associated (necessarily) with energetic relaxation and thus stabilization of YZox, so that energy losses by charge recombination with the QA− anion radical are minimized. The intermediate formed within about 350 ns after YZox formation in the S2 state is discussed in the context of two recent models of the S2 → S3 transition of the water oxidation cycle. This article is part of a Special Issue entitled: Photosynthesis Research for Sustainability: From Natural to Artificial.

    Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference

    In Optical Coherence Tomography (OCT), speckle noise significantly hampers image quality, affecting diagnostic accuracy. Current methods, including traditional filtering and deep learning techniques, have limitations in noise reduction and detail preservation. Addressing these challenges, this study introduces a novel denoising algorithm, Block-Matching Steered-Mixture of Experts with Multi-Model Inference and Autoencoder (BM-SMoE-AE). This method combines a block-matched implementation of the SMoE algorithm with an enhanced autoencoder architecture, offering efficient speckle noise reduction while retaining critical image details. Our method stands out by providing improved edge definition and reduced processing time. Comparative analysis with existing denoising techniques demonstrates the superior performance of BM-SMoE-AE in maintaining image integrity and enhancing OCT image usability for medical diagnostics. Comment: This submission contains 10 pages and 4 figures. It was presented at the 2024 SPIE Photonics West, held in San Francisco. The paper details advancements in photonics applications related to healthcare and includes supplementary material with additional datasets for review.

    Blip10000: a social video dataset containing SPUG content for tagging and retrieval

    The increasing amount of digital multimedia content available is inspiring potential new types of user interaction with video data. Users want to easily find content by searching and browsing. For this reason, techniques are needed that allow automatic categorisation, searching of the content and linking to related information. In this work, we present a dataset that contains comprehensive semi-professional user generated (SPUG) content, including audiovisual content, user-contributed metadata, automatic speech recognition transcripts, automatic shot boundary files, and social information for multiple 'social levels'. We describe the principal characteristics of this dataset and present results that have been achieved on different tasks.